Using LLMs for Training Data Preparation with Nihit Desai

Machine learning models learn patterns and relationships from data to make predictions or decisions. The quality of that data influences how well these models can represent those patterns and generalize to new inputs.

Nihit Desai is the Co-founder and CTO at Refuel.ai. The company is using LLMs for tasks such as data labeling, cleaning, and enrichment. He joins the show to talk about the platform, and how to manage data in the current AI era.

Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Sponsors

As a listener of Software Engineering Daily, you understand the impact of generative AI. On the podcast, we’ve covered many exciting aspects of GenAI technologies, as well as the new vulnerabilities and risks they bring.

HackerOne’s AI red teaming addresses the novel challenges of AI safety and security for businesses launching new AI deployments.

Their approach involves stress-testing AI models and deployments to make sure they can’t be tricked into providing information beyond their intended use, and that security flaws can’t be exploited to access confidential data or systems.

Within the HackerOne community, over 750 active hackers specialize in prompt hacking and other AI security and safety testing.

In a single recent engagement, a team of 18 HackerOne hackers quickly identified 26 valid findings within the initial 24 hours and accumulated over 100 valid findings in the two-week engagement.

HackerOne offers strategic flexibility, rapid deployment, and a hybrid talent strategy. Learn more at Hackerone.com/ai.

WorkOS is a modern identity platform built for B2B SaaS.

It provides seamless APIs for authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

It’s a drop-in replacement for Auth0 and supports up to 1 million monthly active users for free.

It’s perfect for B2B SaaS companies frustrated with the high costs, opaque pricing, and missing enterprise capabilities of legacy auth vendors. The APIs are flexible and easy to use, designed to provide an effortless experience from your first user all the way to your largest enterprise customer.

Today, hundreds of high-growth scale-ups are already powered by WorkOS, including ones you probably know, like Vercel, Webflow, and Loom. Check out workos.com/SED to learn more.

RudderStack is the Warehouse Native Customer Data Platform. With RudderStack, you can collect data from every source, unify it in your data warehouse or data lake to create a customer 360, and deliver it to every team and every tool for activation. RudderStack provides tools to help you guarantee data quality at the source, ensure compliance across the data lifecycle, and create model-ready data for AI/ML teams. With RudderStack, you can spend less time on low-value work and more time driving better business outcomes. Visit Rudderstack.com/SED to learn more.

Software Daily

Subscribe to Software Daily, a curated newsletter featuring the best and newest from the software engineering community.